{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# TensorFlow入门"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 119,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Hello,TensorFlow!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "b'Hello,TensorFlow!'\n"
     ]
    }
   ],
   "source": [
    "hello=tf.constant(\"Hello,TensorFlow!\")\n",
    "sess=tf.Session()\n",
    "print(sess.run(hello))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.1 张量和图\n",
    "\n",
    "TensorFlow 是一种采用数据流图（data flow graphs），用于数值计算的开源软件库。其中 **Tensor代表传递的数据为张量（多维数组）**，**Flow代表使用计算图进行运算**。数据流图用「结点」（nodes）和「边」（edges）组成的有向图来描述数学运算。**结点**一般用来表示施加的数学操作，但也可以表示数据输入的起点和输出的终点，或者是读取/写入持久变量（persistent variable）的终点。**边**表示结点之间的输入/输出关系。这些数据边可以传送维度可动态调整的多维数据数组，即张量（tensor）。\n",
    "\n",
    "下面代码是使用计算图的案例:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "8\n",
      "[[0. 0.]\n",
      " [0. 0.]]\n"
     ]
    }
   ],
   "source": [
    "a = tf.constant(2, tf.int16)\n",
    "b = tf.constant(4, tf.float32)\n",
    "\n",
    "graph = tf.Graph()\n",
    "with graph.as_default():\n",
    "    a = tf.Variable(8, tf.float32)\n",
    "    b = tf.Variable(tf.zeros([2,2], tf.float32))\n",
    "    \n",
    "with tf.Session(graph=graph) as session:\n",
    "    tf.global_variables_initializer().run()\n",
    "    print(session.run(a))\n",
    "    print(session.run(b))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在 Tensorflow 中，**所有不同的变量和运算都是储存在计算图**。所以在我们构建完模型所需要的图之后，还**需要打开一个会话（Session）来运行整个计算图**。在会话中，我们可以将所有计算分配到可用的 CPU 和 GPU 资源中。\n",
    "\n",
    "如下所示代码，我们声明两个常量 a 和 b，并且定义一个加法运算。但它并不会输出计算结果，因为我们只是定义了一张图，而没有运行它:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Tensor(\"add:0\", shape=(2,), dtype=int32)\n"
     ]
    }
   ],
   "source": [
    "a=tf.constant([1,2],name=\"a\")\n",
    "\n",
    "b=tf.constant([2,4],name=\"b\")\n",
    "\n",
    "result = a+b\n",
    "\n",
    "print(result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "下面的代码才会输出计算结果，因为我们需要创建一个会话才能管理 TensorFlow 运行时的所有资源。但计算完毕后需要关闭会话来帮助系统回收资源，不然就会出现资源泄漏的问题。下面提供了使用会话的两种方式："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[3 6]\n"
     ]
    }
   ],
   "source": [
    "a=tf.constant([1,2])\n",
    "\n",
    "b=tf.constant([2,4])\n",
    "\n",
    "result = a+b\n",
    "\n",
    "'''\n",
    "sess=tf.Session()\n",
    "print(sess.run(result))\n",
    "sess.close\n",
    "'''\n",
    "\n",
    "with tf.Session() as sess:\n",
    "    print(sess.run(result))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.2 常量、变量和占位符\n",
    "\n",
    "**TensorFlow 中最基本的单位是常量（constant）、变量（Variable）和占位符（placeholder）。**\n",
    "\n",
    "下面我们分别定义了常量与变量："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2 \n",
      " [[0. 0.]\n",
      " [0. 0.]] \n",
      " [0 0 0 0 0 0 0 0 0 0 0]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "a=tf.constant(2,tf.int16)\n",
    "b=tf.constant(8.9,tf.float32)\n",
    "\n",
    "d=tf.Variable(4,tf.int16)\n",
    "\n",
    "g = tf.constant(np.zeros(shape=(2,2), dtype=np.float32))\n",
    "# 等价于 g=tf.zeros([2,2],tf.float32)\n",
    "\n",
    "h = tf.zeros([11], tf.int16)\n",
    "i = tf.ones([2,2], tf.float32)\n",
    "l = tf.Variable(tf.zeros([5,6,5], tf.float32))\n",
    "\n",
    "# print(a,'\\n',d,'\\n',g,'\\n',i,'\\n',h,'\\n',l)\n",
    "with tf.Session() as sess:\n",
    "    print(sess.run(a),'\\n',sess.run(g),'\\n',sess.run(h))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "TensorFlow数据类型可参考：http://wiki.jikexueyuan.com/project/tensorflow-zh/resources/dims_types.html\n",
    "\n",
    "**常量(constant)**同其他语言的常量, 在赋值后不可修改。tf.constant(\n",
    "    value,\n",
    "    dtype=None,\n",
    "    shape=None,\n",
    "    name='Const',\n",
    "    verify_shape=False\n",
    ")\n",
    "<br /> **占位符(placeholder)**同其他语言的方法的参数, 在执行方法时设置。tf.placeholder(\n",
    "    dtype,\n",
    "    shape=None,\n",
    "    name=None\n",
    ")\n",
    "<br /> **变量(Variable)**比较特殊, 机器学习特有属性. 对于常规的算法而言, 在输入值已知的情况下, 优化变量(参数)使损失函数的值达到最小. 因而在含有优化器(Optimizer)的算法中, 变量是动态计算(变化)的. 如果在未使用优化器时, 变量仍作为普通变量. 注意: 在使用变量之前, 需要执行初始化方法, 系统才会为其赋值。tf.Variable('initial-value', name='optional-name')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Tensor(\"Const_116:0\", shape=(), dtype=float32) Tensor(\"Const_117:0\", shape=(), dtype=float32)\n",
      "7.5\n",
      "[3. 7.]\n",
      "linear_model:  [0.         0.3        0.6        0.90000004]\n"
     ]
    }
   ],
   "source": [
    "with tf.Session() as sess:\n",
    "    # 常量\n",
    "    node1 = tf.constant(3.0, tf.float32)\n",
    "    node2 = tf.constant(4.0)\n",
    "    print (node1, node2)  # 只打印结点信息\n",
    "\n",
    "    # 占位符\n",
    "    a = tf.placeholder(tf.float32)\n",
    "    b = tf.placeholder(tf.float32)\n",
    "    adder_node = a + b  # 与调用add方法类似\n",
    "    print (sess.run(adder_node, {a: 3, b: 4.5}))\n",
    "    print (sess.run(adder_node, {a: [1, 3], b: [2, 4]}))\n",
    "\n",
    "    # 变量\n",
    "    W = tf.Variable([.3], tf.float32)\n",
    "    b = tf.Variable([-.3], tf.float32)\n",
    "    x = tf.placeholder(tf.float32)\n",
    "    # linear_model = node1 * x + node2\n",
    "    linear_model = W * x + b\n",
    "    sess.run(tf.global_variables_initializer()) #初始化模型参数\n",
    "    print (\"linear_model: \", sess.run(linear_model, {x: [1, 2, 3, 4]}))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在会话中，占位符可以使用 feed_dict 馈送数据。\n",
    "\n",
    "feed_dict 是一个字典，在字典中需要给出每一个用到的占位符的取值。在训练神经网络时需要每次提供一个批量的训练样本，如果每次迭代选取的数据要通过常量表示，那么 TensorFlow 的计算图会非常大。因为每增加一个常量，TensorFlow 都会在计算图中增加一个结点。所以说拥有几百万次迭代的神经网络会拥有极其庞大的计算图，而**占位符却可以解决这一点，它只会拥有占位符这一个结点**。\n",
    "\n",
    "下面一段代码分别展示了使用常量和占位符进行计算："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 92,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[-0.11131823  2.3845987 ]]\n",
      "[[-0.11131823  2.3845987 ]]\n",
      "[[2. 2.]]\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<bound method BaseSession.close of <tensorflow.python.client.session.Session object at 0x000001F663C4BEF0>>"
      ]
     },
     "execution_count": 92,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "w1=tf.Variable(tf.random_normal([1,2],stddev=1,seed=1))\n",
    "\n",
    "#因为需要重复输入x，而每建一个x就会生成一个结点，计算图的效率会低。所以使用占位符\n",
    "x=tf.placeholder(tf.float32,shape=(1,2))\n",
    "x1=tf.constant([[0.7,0.9]])\n",
    "\n",
    "a=x+w1\n",
    "b=x1+w1\n",
    "\n",
    "sess=tf.Session()\n",
    "sess.run(tf.global_variables_initializer())\n",
    "\n",
    "#运行y时将占位符填上，feed_dict为字典，变量名不可变\n",
    "y_1=sess.run(a,feed_dict={x:[[0.7,0.9]]})\n",
    "y_2=sess.run(b)\n",
    "\n",
    "print(y_1)\n",
    "print(y_2)\n",
    "sess.close"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.3 实例\n",
    "例1：计算多维张量欧里几得距离"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 110,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "the distance between [[1 2]] and [[15 16]] -> 19.79899024963379\n",
      "the distance between [[3 4]] and [[13 14]] -> 14.142135620117188\n",
      "the distance between [[5 6]] and [[11 12]] -> 8.485280990600586\n",
      "the distance between [[7 8]] and [[ 9 10]] -> 2.8284270763397217\n"
     ]
    }
   ],
   "source": [
    "list_of_points1_ = [[1,2], [3,4], [5,6], [7,8]]\n",
    "list_of_points2_ = [[15,16], [13,14], [11,12], [9,10]]\n",
    "\n",
    "# list_of_points1 = np.array([np.array(elem).reshape(1,2) for elem in list_of_points1_])\n",
    "# list_of_points2 = np.array([np.array(elem).reshape(1,2) for elem in list_of_points2_])\n",
    "\n",
    "\n",
    "#graph = tf.Graph()\n",
    "\n",
    "#with graph.as_default():   \n",
    "\n",
    "def calculate_eucledian_distance(point1, point2):\n",
    "    difference = tf.subtract(point1, point2)  # 相减\n",
    "    power2 = tf.pow(difference, tf.constant(2.0, shape=(1,2))) # 幂乘\n",
    "    add = tf.reduce_sum(power2) # 计算一个张量的各个维度上元素的总和\n",
    "    eucledian_distance = tf.sqrt(add) # 开根\n",
    "    return eucledian_distance\n",
    "\n",
    "#我们使用 tf.placeholder() 创建占位符 ，在 session.run() 过程中再投递数据 \n",
    "point1 = tf.placeholder(tf.float32, shape=(1, 2))\n",
    "point2 = tf.placeholder(tf.float32, shape=(1, 2))\n",
    "\n",
    "dist = calculate_eucledian_distance(point1, point2)\n",
    "\n",
    "# 开启会话\n",
    "with tf.Session() as session:\n",
    "    #tf.global_variables_initializer().run()   \n",
    "    for ii in range(len(list_of_points1)):\n",
    "        point1_ = list_of_points1[ii]\n",
    "        point2_ = list_of_points2[ii]\n",
    "\n",
    "        #使用feed_dict将数据投入到dist中\n",
    "        distance = session.run(dist, feed_dict={point1 : point1_, point2 : point2_})\n",
    "        print(\"the distance between {} and {} -> {}\".format(point1_, point2_, distance))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "例2：构建三层全连接神经网络"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 134,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "初始化权重为：\n",
      " [[-0.8113182   1.4845988   0.06532937]\n",
      " [-2.4427042   0.0992484   0.5912243 ]] \n",
      " [[-0.8113182 ]\n",
      " [ 1.4845988 ]\n",
      " [ 0.06532937]]\n",
      "在迭代0次后，训练损失为0.308504\n",
      "在迭代1000次后，训练损失为0.0393406\n",
      "在迭代2000次后，训练损失为0.0182158\n",
      "在迭代3000次后，训练损失为0.0104779\n",
      "在迭代4000次后，训练损失为0.00680374\n",
      "在迭代5000次后，训练损失为0.00446512\n",
      "在迭代6000次后，训练损失为0.00296797\n",
      "在迭代7000次后，训练损失为0.00218553\n",
      "在迭代8000次后，训练损失为0.00179452\n",
      "在迭代9000次后，训练损失为0.0013211\n",
      "在迭代10000次后，训练损失为0.000957699\n"
     ]
    }
   ],
   "source": [
    "# 定义变量w1,w2（权重）\n",
    "w1=tf.Variable(tf.random_normal([2,3],stddev=1,seed=1))\n",
    "w2=tf.Variable(tf.random_normal([3,1],stddev=1,seed=1))\n",
    "\n",
    "# 定义占位符x,y（样本集）\n",
    "x=tf.placeholder(tf.float32,shape=(None,2)) # None可以根据batch大小确定维度，在shape的一个维度上使用None\n",
    "y=tf.placeholder(tf.float32,shape=(None,1))\n",
    "\n",
    "# 定义ReLU激活函数\n",
    "a=tf.nn.relu(tf.matmul(x,w1)) #tf.nn.relu(features, name = None),这个函数的作用是计算激活函数relu，即max(features, 0)。即将矩阵中每个元素的负值置0。\n",
    "yhat=tf.nn.relu(tf.matmul(a,w2)) # tf.matmul为矩阵相乘,yhat为预测的值\n",
    "\n",
    "# 定义交叉熵损失函数和训练算法AdamOptimizer\n",
    "cross_entropy=-tf.reduce_mean(y*tf.log(tf.clip_by_value(yhat,1e-10,1.0))) #tf.clip_by_value(A, min, max)：输入一个张量A，把A中的每一个元素的值都压缩在min和max之间。小于min的让它等于min，大于max的元素的值等于max。\n",
    "train_op=tf.train.AdamOptimizer(0.001).minimize(cross_entropy) # 学习率为0.001\n",
    "\n",
    "# 随机生成512个样本，样本特征维数为2\n",
    "data_size=512\n",
    "X = np.random.RandomState(1).rand(data_size,2) # 样本范围为[0, 1)\n",
    "# 生成标签，1为正样本,0为负样本\n",
    "Y = [[int(x1+x2<1)] for (x1,x2) in X]\n",
    "\n",
    "batch_size=10 # 每次训练读取样本个数\n",
    "\n",
    "with tf.Session() as sess:\n",
    "    sess.run(tf.global_variables_initializer()) # 初始化\n",
    "    print('初始化权重为：\\n',sess.run(w1),'\\n',sess.run(w2))\n",
    "    steps=10001\n",
    "    for i in range(steps):\n",
    "        #选定每一个批量读取的首尾位置，确保在1个epoch（全部样本训练一次为1个epoch）内采样训练\n",
    "        start = i*batch_size % data_size\n",
    "        end = min(start+batch_size,data_size)\n",
    "        sess.run(train_op,feed_dict={x:X[start:end],y:Y[start:end]}) # 开始训练\n",
    "        if i % 1000 == 0:\n",
    "            training_loss=sess.run(cross_entropy,feed_dict={x:X,y:Y})\n",
    "            print(\"在迭代%d次后，训练损失为%g\"%(i,training_loss))\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "说明：上面的代码定义了一个简单的三层全连接网络（输入层、隐藏层和输出层分别为 2、3 和 1 个神经元），隐藏层和输出层的激活函数使用的是 ReLU 函数。该模型训练的样本总数为 512，每次迭代读取的批量为 10。这个简单的全连接网络以交叉熵为损失函数，并使用Adam优化算法进行权重更新。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
